Training-free open-vocabulary monocular 3D object detection for industrial assets

Masoud Kamali; Behnam Atazadeh; Abbas Rajabifard; Yiqun Chen

Journal article

Automation in Construction | Elsevier | Published: 2026

DOI: 10.1016/j.autcon.2026.106841

Open access

Abstract

3D scene understanding in brownfield industrial environments plays a critical role in operation and maintenance activities. However, the absence of reliable 3D information poses significant challenges for accurate object detection. Existing 3D object detection methods rely on labelled datasets, which are scarce in industrial contexts given the complexity and diversity of assets. In addition, current methods often require depth information and camera parameters for 2D-to-3D transformation. To address these limitations, this paper proposes a training-free open-vocabulary monocular 3D object detection approach that eliminates the reliance on labelled datasets, depth information, and camera para..
